Text File | 1996-01-04 | 35KB | 1,172 lines
-(G)-(E)-(M)-(A)-
[G]enPC [E]lite [M]acro [A]ssembler
(C)oderite SECTOR ONE 1994-95
English documentation for version 2.6
I. Introduction
1. Shareware
2. Credits
3. Greetings
II. Generalities
1. Addressing modes
2. Arithmetic
3. Assembly directives
III. Mnemonics
IV. Conclusion
--==--
I. Introduction
_______________
GenPC, aka GEMA, is a new symbolic assembler for MS-DOS. It is
mainly based upon the 68k reference : GenST on the Atari ST. Moreover, the
logical structure of the Motorola 680x0 syntax was adapted to Intel mnemonics,
as it is actually easier and more logical. Unlike TASM, which aims for a lousy
pseudo-structured style, features lotsa bugs ( especially with 386+
instructions ) and doesn't really let us guess how our source code will be
assembled, GEMA lets you enjoy heavy coding and was especially designed for
32-bit processing.
It now supports all the opcodes of Intel processors, from 8086 to P6,
including all the discovered but undocumented opcodes !
In addition, it is much faster than TASM, doesn't need any linker,
and features handy assembly directives, especially INCBIN, which has always
been missing from TASM and MASM.
If you have never coded in machine language before, GEMA is the tool
you need to discover the marvellous ( well... just about ) world of 80x86. And
you'll easily be able to learn 680x0 if you need to.
If you already know the 680x0 joys, you won't have to worry about the
lousiness of Intel stuff anymore and won't yell about the lameness of the
classical assembly tools.
If you're coding on 80x86, you must be fed up with TASM and MASM. So
GEMA is the assembler you need ! It is especially designed for 32-bit coding
( protected or flat-real modes ) and is really easy to use in this context,
unlike the previous references.
A fully integrated environment with a powerful editor and a MonST-like
debugger is about to be released.
I.1 Shareware
-------------
GEMA is SHAREWARE. Installing it on your hard disk implies your
agreement to the following terms :
- If you are working in a telco company, you have to clean up my
phone-bills,
- If you think the world of COBOL, you have to kill yourself,
- If you're one of my teachers, you have to gimme good results in the
exams and projects,
- If you're a coder, graphixx-man, music-man, or courier, love demos,
and don't belong to any crew, you'd better join Sector One,
- If you're a 20-year-old female, you must fall in love with me,
- If you can get cheap hardware, let me know,
- If you find bugs in GEMA, let me know too,
- If you find none, it's because you never used it :) Bad boy,
BTW, this is just a beta release, that's why some bugs may still be
present. Please report them to me by writing down exactly what you did, what
happened, what should have happened, and the version of your release.
If you'd like to register as a kewl official GEMA user, please send me
a nice letter telling me what you think about it. You can also help me code
a part of the complete assembly package. Just ask for the source code and tell
me what you'd like to do with it. You can also send me a little money, at least
to cover the mail charges. The official registration fee was 50 FF or $10.00 .
If you do it :
- You will have a good conscience,
- You will help the author improve GEMA and code other
shareware,
- You will get all the new releases by e-mail or snail-mail,
- Your name will be quoted in our greetings-list.
I'd also enjoy receiving anything done with GEMA.
I.2 Credits
-----------
Assembler : design, documentation, coding ........... Jedi/Sector One
Moral support : .................................... Stephanie Mauger
Spelling correction of the French doc : ....................... Mogar
Used software : QEdit, Gema, Hacker's View, DJGPP
Beta-testing : MJS, Altomcat/Sector One, Oxygene, Keops/Equinox, Alexey
Voinov
You can get in touch with me at the following address :
Frank DENIS
2, rue Madeleine Laffite
F-93100 MONTREUIL
FRANCE
Or thru InterNet : j@nether.net
You can get the latest release by sending an e-mail titled
"GET GEMA" to the previous address, or on ftp.nether.net in the directory :
/pub/gema/* . You can also leech it without any ratio on ACE BBS
+33-1-4588-7548 or thru FidoNet with "GEMA" as keyword on the node 2:320/305.
But please, never phone me...
I.3 Greetings
-------------
Dream regards to : Infiny ( LCA, Gandalf ), Eclipse ( Hacker Croll ),
CyberPunk, Trash, Dream Syndicate, Underground Tectonics, EKO ( Maxx-Out,
McDo, Createur, Jool ), Eagles ( Ard ), Equinox ( Keops, Checksum, Al Cool ),
Lego System ( Skill ), Dune, Fantasy, Genesis, DBA ( Bonus Software ),
Sentry ( Eagle ), Isiolis, Imphobia, Dead Hacker Society, Control Team,
Quicky, Daniel Bozinov, Michel Furic, Live!, Fantasy, Anixter, Fongus,
Bresil, DSK, Alexey Voinov, Oxygene, Jared, Impact Studios, Kloon, Antares,
Pulse, RealTech, Animal Mine, Oxyron, Max in the Star System, Epsilon, EMF,
Plant, Cascada, Cubic Team, and you !
II. Generalities
----------------
GEMA works on 386, 486, Pentium or P6+. It takes one or two
parameters : the name of the source file and, optionally, the name of the
executable file.
eg : gema foo.s
or : gema foo.s bar.exe
If the second parameter is missing, the name of the executable file
will be the name of the source with a COM or EXE suffix.
Some options may prefix the file names :
-E or --preprocess : display each processed line
-v or --verbose : verbose output
-q or --quiet : reduced output
-o or --optimize : extended automatic optimizations ( 3 passes )
-nw or --nowarning : disable warnings
-a or --autoalign : enable automatic alignment ( experimental )
-86, -88, --cpu=86 or --cpu=88,
-186 or --cpu=186,
-286 or --cpu=286,
-386 or --cpu=386,
-486 or --cpu=486,
-586, -pentium, --cpu=586 or --cpu=pentium,
-686, -p6, --cpu=686 or --cpu=p6 : assemble only the opcodes supported
by the designated kind of processor. By default, all instructions from 8086 to
P6 are supported.
II.1 Addressing modes
---------------------
All addressing modes conform to the Motorola 680x0 format, ie :
Designation               Intel                  GEMA
------------------------------------------------------------------
Short immediate           12                     #12.b
Immediate (word)          32000                  #32000.w
Immediate (long word)     99999                  #99999.l
.b, .w and .l are optional. They allow forcing a type, for instance
to cast a value that fits into one byte as a long word. This can be quite
useful with self-modifying code. If the size isn't explicitly fixed, GEMA
automatically finds the smallest one.
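The automatic size selection can be pictured with a rough Python sketch. The function name and the use of unsigned ranges are assumptions made for illustration only; this documentation does not say how GEMA treats negative values.

```python
def smallest_size(value):
    """Return the smallest size suffix able to hold an unsigned value.

    Assumption: values are unsigned; GEMA's real rule may differ.
    """
    if value < 0:
        raise ValueError("negative values are outside this sketch")
    if value <= 0xFF:
        return ".b"
    if value <= 0xFFFF:
        return ".w"
    if value <= 0xFFFFFFFF:
        return ".l"
    raise ValueError("value does not fit in a long word")


# The three immediates from the table above.
for number in (12, 32000, 99999):
    print(number, smallest_size(number))
```

An explicit suffix simply overrides this choice, which is what makes it useful for self-modifying code.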
Designation               Intel                  GEMA
------------------------------------------------------------------
Direct                    AH, BX, ECX, SI, CS    AH, BX, CE, SI, CS
Under GEMA, the registers are named the same way as with TASM,
except for 32-bit registers, which look like : AE, BE, CE, DE, SIE, DIE, BPE
and SPE, as it's more logical. But in most cases, just one letter ( A, B, C or
D ) is enough. Indeed, as on the 680x0, the size of the operands is explicitly
part of the instruction and GEMA sets them automatically.
So : NEG.B A under GEMA means NEG AL with TASM.
By default, a ".B" instruction applied to a single-letter register is
interpreted as AL, BL, CL or DL. To refer to AH, BH, CH or DH, just write the
full designation. For instance : NEG.B AH
NEG.W A under GEMA means NEG AX with TASM
Except for instructions that don't deal with words as operands, the
default size of all mnemonics is the word. So NEG.W A and NEG A have the
same effect. The same goes for most instructions.
NEG.L A under GEMA means NEG EAX with TASM
We can always use the full designation of a register instead of a
single-letter shorthand. For instance, NEG.L EAX is okay. But NEG.L AX will
raise a warning, as the operand size is inconsistent with the instruction size.
If you ever think it is harder to state the size explicitly in the
mnemonic than to rely on an implicit guess based upon operand sizes, you have
understood nothing about life. 'Coz it's actually clearer like this and you
don't have to add gadgets like WORD PTR in ambiguous cases.
So : NEG.L (SI,DI) under GEMA means NEG DWORD PTR [SI][DI] with
TASM.
Oh BTW, I was about to forget it... You can always use ".s" instead of
".b" if you prefer... This was just designed to stick to the GenST conventions.
Designation               Intel                  GEMA
------------------------------------------------------------------
Short absolute            [12]                   12.b
Absolute (word)           [32000]                32000.w
Absolute (long)           [99999]                99999.l
Once again, the ".b", ".w" and ".l" are optional. They are only
useful to cast a type. Otherwise, the assembler guesses them on its own.
Under GEMA, as on the 680x0, an immediate value is prefixed
( with a "#" ) whereas an absolute address has no prefix. The same goes
for labels :
NEG label under GEMA means NEG WORD PTR [label] with TASM
Designation               Intel                  GEMA
------------------------------------------------------------------
Indirect                  [SI]                   (SI)
Indirect with register    [SI+BX]                (SI,BX) or (SI,B)
The default size for an index register is a word.
Designation                      Intel               GEMA
------------------------------------------------------------------
Indirect with reg and offset.b   [SI+BX+12]          12.b(SI,BX)
Indirect with reg and offset.w   [SI+BX+32000]       32000.w(SI,BX)
Indirect with reg and offset.l   [ESI+EBX+99999]     99999.l(SIE,BE)
As expected, ".b", ".w" and ".l" are optional, they're only useful
as casts. For instance 12(SI,BX) is exactly equivalent to 12.b(SI,BX). Most of
the time, there's no point in adding these suffixes. Offsets can be written as
they are, GEMA will find its way outta difficulties.
Designation               Intel                   GEMA
------------------------------------------------------------------
Indirect reg/off/factor   [ESI+EBX*factor+off]    off(SIE,BE*factor)
Once again, offsets can be bytes, words or long words. BTW, a factor
can only be applied to 32-bit registers, and the same goes for long offsets
( someone told me that was a bug of GEMA, but it is not : this is the tricky
Intel architecture ) .
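To make the last addressing mode concrete, here is a small Python sketch of how the address for off(SIE,BE*factor) is formed. The function name is hypothetical; the allowed factors ( 1, 2, 4 and 8 ) come from the x86 instruction encoding, not from GEMA.

```python
def effective_address(base, index, factor=1, offset=0):
    """Compute offset + base + index*factor, wrapped to 32 bits."""
    if factor not in (1, 2, 4, 8):
        raise ValueError("x86 only encodes the factors 1, 2, 4 and 8")
    return (base + index * factor + offset) & 0xFFFFFFFF


# 8(SIE,BE*4) with SIE = $1000 and BE = 3
print(hex(effective_address(0x1000, 3, factor=4, offset=8)))
```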
THE *SOURCE* ARGUMENT IS ALWAYS THE *FIRST* ONE AND AN OPTIONAL TARGET
ARGUMENT, ALWAYS THE NEXT ONE.
This is nothing but the intelligent Motorola logic.
So with GEMA, to copy the value of AX into BX, just write :
        MOVE A,B ( or MOV A,B )
... and not MOV B,A as TASM would expect.
Of course, instructions like ENTER, which don't have source and
destination arguments, have no reason to get reversed arguments. But for the
rest, the Motorola syntax has to be used.
II.2 Arithmetic
---------------
All classical operators are available for offsets and immediate values.
Here they are, in decreasing order of priority :
[] : These act as parentheses
-  : Negation of a number
<  : Left shift. For instance : 3<2 gives 12
>  : Right shift. For instance : 6>1 gives 3
^  : Exclusive OR
&  : Logical AND
|  : Logical OR
/  : Divide
*  : Multiply
-  : Sub
+  : Add
%  : Modulo
=  : Equal : gives 0 if the comparison is false, 1 otherwise
@  : Divide what follows by 16. For instance, @fooLabel gives the
segment address of the label fooLabel, according to the current ASSUME value.
This operator can be stacked : for instance, @@fooLabel divides fooLabel by
256. But it's just an arithmetic operation, there will be no relocation
( see \ below ) .
~  : Logical NOT
\  : Gives the segment address of what follows. This forces the
result of the expression to a word, and it will be relocated if the executable
code is a .EXE . It's a kinda SEG prefix like with MASM and TASM.
:  : ( Yes, it's the ':' character )
Versions < 2.5 : returns the 4 lower bits of what follows,
Versions >=2.5 : returns what follows modulo the current segment
size. It was changed for compatibility with A2G. In practice, this is
quite a dummy operator...
Stupid but funny example : 2+3*4/[-5-7]<[~3^5]
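For readers used to C-like notation, the less common operators of this list map to ordinary integer operations. The Python sketch below is only an illustration; the function names are made up, and integer ( truncating ) division for '@' is an assumption.

```python
def shift_left(value, count):
    """GEMA 'value<count'."""
    return value << count


def shift_right(value, count):
    """GEMA 'value>count'."""
    return value >> count


def at(value, times=1):
    """GEMA '@value' ( times=1 ) or '@@value' ( times=2 ):
    divide by 16 once per '@', assuming integer division."""
    return value >> (4 * times)


print(shift_left(3, 2))        # 3<2
print(shift_right(12, 2))      # 12>2
print(hex(at(0x12345)))        # @$12345
print(hex(at(0x12345, 2)))     # @@$12345
```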
Apart from numbers, lotsa other things may be used inside arithmetic
operations :
*  : An asterisk means the address of the current instruction, or to be
more precise, its offset from the beginning of the program.
For instance :      bra.s *
equals :       foo  jmp foo
'' : ASCII values of one or more characters.
For instance : 'A' equals 65
               'AB' equals $4142
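The two examples above suggest that the first character ends up in the most significant byte. A short Python sketch of that reading ( the helper name is hypothetical ) :

```python
def char_constant(text):
    """Pack ASCII characters into one number, first character highest."""
    value = 0
    for code in text.encode("ascii"):
        value = (value << 8) | code
    return value


print(char_constant("A"))        # 65
print(hex(char_constant("AB")))  # 0x4142
```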
Bases :
-------
A raw number is parsed as decimal by default, even if there are one or
more zeros at the beginning.
Hexadecimal numbers just have to be prefixed by a '$' ( how silly is
the 'h' suffix with sometimes a '0' prefix to avoid confusions ! )
For instance : $ABCD1234 under GEMA means 0ABCD1234h with TASM
A binary constant has to be prefixed by a '%'.
For instance : %101 means 5
An octal number has to begin with the 'º' sign.
Of course, arithmetic operations can mix several bases. And casting
is always possible on the whole expression...
For instance : [1+$12A/%10001001].b
...But it can also be done on individual terms of an expression :
For instance : $1234.b+1 means $35 and not $1235 because the ".b"
reduced the result to a single byte.
So GEMA can evaluate complex expressions mixing different bases with
type casts. Results can be used anywhere as constants, offsets or immediates.
But GEMA can also process these operations on symbols.
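The effect of a cast on a term can be checked with a few lines of Python. This sketch assumes a cast simply keeps the low-order bits and that '/' is an integer division; the names are illustrative, not part of GEMA.

```python
MASKS = {".b": 0xFF, ".s": 0xFF, ".w": 0xFFFF, ".l": 0xFFFFFFFF}


def cast(value, suffix):
    """Reduce a value to the size named by a GEMA-style suffix."""
    return value & MASKS[suffix]


# $1234.b+1 : the cast applies to $1234 only, then 1 is added.
print(hex(cast(0x1234, ".b") + 1))

# [1+$12A/%10001001].b : the cast applies to the whole expression.
print(cast(1 + 0x12A // 0b10001001, ".b"))
```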
Symbols :
---------
There are 3 sorts of symbols :
- Labels
------
The global format of an instruction line under GEMA is :
[Label[:]] [Mnemonic] [Arguments] [;] [Comments]
A label always has to be at the very beginning of a line ( without any
leading space ) . So the classical trailing colon is now optional.
A mnemonic must not be directly at the beginning of a line. Spaces
or tabs must precede it to avoid it being interpreted as a label. Reserved
keywords can therefore always be used as label names, this is not ambiguous.
Comments don't need a semicolon. But avoid this kind of comment,
as it clutters the source code, which becomes less readable. And the heavy
parsing technique used by GEMA sometimes has trouble telling comments from
arguments. But don't panic : in all cases, you'll just get an error message,
and never an unexpected result.
A complete line of comments has to begin with a '*', ';', '%' or '/' .
For instance :
foo     move.l  a,b             this is a comment
        addx    (si,bx),c
bar
        bra.s   foo
* dummy line
/ Another line of comments
junk:   nop
Usually, the semicolon is not necessary to distinguish instructions
from comments. But look at the following case :
        rts returns to the main routine
rts can be used without any argument or with a number corresponding to
an extra depth of the stack. That's why GEMA will think "returns" is that depth
( being a label ) and "to the main routine" is a comment. To avoid this, just
add a semicolon :
        rts ; returns to the main routine
And there will be no possible confusion anymore. Thanks to Altomcat for
having reported that problem to me. In case of doubt, you can add semicolons
all the time, this will be ok.
You can also add extra spaces everywhere, GEMA will ignore them.
For instance :
        addx.l 4 + 3 / [ 1 + 2 ] ( sie , be * 8 ) , d
But never insert spaces between a mnemonic and its optional size
indicator ( "addx.l" is ok, not "addx . l" ) .
The START label is always defined as a null label, ie. the beginning
of the program, and can be used by pedants the same way as NULL in C.
A label can be used inside an arithmetic expression. In this case, it
represents the offset from the beginning of the program, less the ASSUME value
( see ASSUME below ) + the last ORG value ( see ORG below ) .
For instance :
foo     move.l  #[foo-bar]/2,foo+2
        ...
bar     flush
Such labels can be defined only once in a source file, otherwise
GEMA would produce a redeclaration error. They have a constant value for the
whole assembly, unlike local labels and variables.
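The value a label takes inside an expression can be written as a one-line formula. The Python sketch below reads the rule as "offset - ASSUME + ORG"; that reading, and the function name, are assumptions made for illustration.

```python
def label_value(offset, assume=0, org=0):
    """Value of a label inside an expression.

    offset : distance from the beginning of the program
    assume : current ASSUME value
    org    : last ORG value
    """
    return offset - assume + org


# A label 16 bytes into a .COM program assembled with "org $100".
print(hex(label_value(0x10, org=0x100)))
```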
- Local labels
------------
Same as global labels, except that they can be redefined. Their
names have to begin with a dot. It's a pretty handy feature for loops.
For instance :
        move    #$1234,c
.wait   dec     c
        bne.s   .wait           bne = jnz
        ... some other piece of code ...
        move    a,c
.wait   nop
        cmps.b
        dbeq    .wait           dbeq = loopnz
Local labels can be used in arithmetic expressions and equal their
most recent definition.
They can even be set to any value with the SET directive and act like
real numeric constants or counters.
For instance :
.wait   set     $1234
Oh yeah, that local label now has the $1234 value. But before being
assigned a constant value, it is assigned the instruction offset like any other
label. So what, would you say ? Yup, just try for instance :
.wait   set     .wait+2
Each time you do that, .wait will be evaluated as two more
bytes than its last declaration. IMHO, it's really useful for self-modifying
code.
But now, let us imagine another situation, where you have to build a
table of the multiples of 3. We'll explain later that the REPT...ENDR structure
allows repeating a piece of source code several times and that DC.L is used to
insert a constant long word ( like DD with TASM ) . What would be interesting
in this case is for .wait to take the offset it was declared at only the first
time, and then to depend only on the SET value from then on.
Of course, GEMA manages it with "assembly variables".
- Assembly variables
------------------
Their names begin with a '!'. They can be used as traditional labels.
They are in fact local labels that are assigned the offset they stand at
only the first time they're defined. So here is the way we can build our table
of multiples ( 0, 3, 6, 9, ... 255*3 ) :
!foo    set     0
        rept    256
        dc.l    !foo
!foo    set     !foo+3
        endr
Marvellous, isn't it ? Any arithmetic expression can mix these 3 sorts
of symbols and can even cast them to any type.
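To see what the REPT loop above should emit, here is the same table built in Python. This is only a sketch : the function name is hypothetical, and the little-endian layout of each long word is the usual x86 convention rather than something stated in this documentation.

```python
import struct


def multiples_table(step=3, count=256):
    """Emit `count` 32-bit words holding 0, step, 2*step, ..."""
    table = bytearray()
    value = 0                              # !foo set 0
    for _ in range(count):                 # rept 256
        table += struct.pack("<I", value)  # dc.l !foo
        value += step                      # !foo set !foo+3
    return bytes(table)                    # endr


table = multiples_table()
print(len(table), "bytes, last entry =", struct.unpack("<I", table[-4:])[0])
```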
Reserved keywords are supported, and there is no limit on the
significant length. Upper and lower case are distinguished. Allowed characters
for a label name are alphanumerics ( of course with no digit as the first
character ), underscore, '!' and '.' .
II.3 Assembly directives
------------------------
They have to be placed in the second column, like mnemonics.
- REPT <constant> ... ENDR
------------------------
Repeat a piece of code <constant> times.
For instance :
        rept    5
        nop
        xlat
        endr
will produce : nop xlat nop xlat nop xlat nop xlat nop xlat
- SET
---
See the previous part.
- ORG <constant>
--------------
Set the base offset. Exactly like its equivalents in all other
assemblers, except that GEMA supports more than one, anywhere in a program
( even though it is interesting only at the beginning ) .
For instance :
        org     $100
GEMA will also accept org #$100
- TITLE <title>
-------------
Gives a title to the current source. Not used yet.
- INCLUDE <file>
--------------
Merge the file at this point of the current source and continue the
assembly procedure ( like #include in C ) . Any source can include another one
that may include other ones that may... There is no depth limit, but a basic
check for circular references is done. The file name can be quoted,
double-quoted or not quoted at all.
- USE16
-----
Assume that the following code will be in a 16-bit code segment by
default ( ie. needs prefixes for 32-bit accesses ). Enabled by default.
- USE32
-----
Assume that the following code will be in a 32-bit code segment by
default. This is only possible in protected mode, and needed by most of the
DOS-Extenders ( like PMODE ) .
- OPT
---
Enable or disable several options. Overrides command line options.
OPT o+ : enable automatic optimizations
OPT o- : disable them ( default )
OPT w+ : enable all warnings ( default )
OPT w- : disable them
OPT v+ : verbose mode
OPT v- : normal mode ( default )
OPT q+ : quiet mode
OPT q- : normal mode ( default )
OPT a+ : enable automatic alignment
OPT a- : disable automatic alignment ( default )
- ONCE
----
Like the #pragma once implemented in most C compilers. The file is not
included again if it was already included before.
- INCBIN <file>
-------------
Here comes THE directive missing from TASM and MASM. It allows you to
insert any BINARY file in the executable code. Forget lousy hexadecimal
conversions or loading tiny files. To insert the picture of your girlfriend at
the "tut" label, you now just have to do :
tut     incbin  cindy.jpg
or
tut     incbin  "cindy.jpg"
- DC
--
Inserts a byte, a word, a long word or a string.
For instance :
        dc.b    1,2,3,4,"Tototata",'t',10
        dc.w    $1234,"tuttut",4
In the last example, "tuttut" is inserted as single bytes, as if we had
written :
        dc.w    $1234
        dc.b    "tuttut"
        dc.w    4
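The byte stream this produces can be sketched in Python. The helper below is hypothetical and assumes the usual x86 little-endian order for words; strings are copied byte by byte, as described above.

```python
import struct


def dc_w(*items):
    """Emulate 'dc.w' : numbers become 16-bit words, strings stay bytes."""
    out = bytearray()
    for item in items:
        if isinstance(item, str):
            out += item.encode("ascii")
        else:
            out += struct.pack("<H", item & 0xFFFF)
    return bytes(out)


print(dc_w(0x1234, "tuttut", 4).hex(" "))
```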
- DS
--
Inserts several null bytes.
For instance :
        ds.b    4       means   dc.b 0,0,0,0
        ds.l    3       means   dc.l 0,0,0
- EVEN, ALIGN, SEGMENT, PAGE, DPAGE, PPAGE
----------------------------------------
Inserts several nops so that the next instruction will be aligned.
EVEN = 2 bytes
ALIGN.B / .W / .L / .Q = Try and guess ( .Q = Quad, .B is just here
for fun ! )
SEGMENT = 16 bytes
PAGE = 256 bytes
DPAGE = 512 bytes
PPAGE = 2048 bytes
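The amount of padding follows directly from the current offset. Here is a minimal Python sketch, with hypothetical names; the NOP opcode ( $90 ) is the standard x86 one.

```python
ALIGNMENTS = {"EVEN": 2, "SEGMENT": 16, "PAGE": 256, "DPAGE": 512, "PPAGE": 2048}
NOP = b"\x90"


def padding(offset, directive):
    """Return the NOP bytes needed to align `offset` for `directive`."""
    boundary = ALIGNMENTS[directive]
    return NOP * (-offset % boundary)


for directive in ALIGNMENTS:
    print(directive, len(padding(0x123, directive)))
```

When the offset is already aligned, nothing is inserted.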
- MIN or MINI <xxx>
-----------------
Minimal size your program needs ( .EXE files only ) in 16-byte chunks.
Default is the size of your program.
- MAX or LIMIT <xxx>
------------------
Set the maximal memory size your program has to reserve when loaded,
in 16-byte chunks. Useful for overlays and resident programs.
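Since MIN and MAX count 16-byte chunks, a byte size has to be converted first. A short Python sketch of two ways to do it ( both function names are made up for illustration ) :

```python
def chunks_exact(size_in_bytes):
    """Smallest number of 16-byte chunks covering size_in_bytes."""
    return (size_in_bytes + 15) // 16


def chunks_simple(size_in_bytes):
    """Same idea written as 1+@size : never too small, sometimes one more."""
    return 1 + (size_in_bytes >> 4)


for size in (1, 16, 17, 4096):
    print(size, chunks_exact(size), chunks_simple(size))
```

The sample .EXE program at the end of this document uses the second form, "1+@fin", which is simpler to write with the '@' operator.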
- OVERLAY <xxx>
-------------
Set an overlay ID.
- HEADER
------
Inserts the traditional header of a .EXE file, with the relocation
table and all that stuff. It's usually the first instruction of a .EXE program.
But GEMA allows you to put one anywhere you want ( might be useful for
self-extracting archives or nested executables ) .
- STACK
-----
Describe where the initial stack ( SS:SP ) will be located when a .EXE
file is launched. All .EXE files should have a STACK.
For instance :
        header
* lotsa code
        ds.l    256
* 256*4 bytes for the stack, that's enough !
        stack
* other code or data
- ASSUME
------
Set the reference used to compute the following label offsets inside
arithmetic expressions. TASM allows this directive to have different values for
each segment register. Well... IMHO there is no point in doing that as we should
be able to do anything we need with our segment pointers, but maybe this feature
will be implemented in the next releases of GEMA if requested by enough people.
- FATAL
-----
Stop assembling.
- SECTION BSS or BSS
------------------
What follows will be "passively" assembled : the inline labels will
be computed but no code will be integrated in the executable file. Usually
used with "ds" at the end of a program.
- SECTION TEXT or SECTION DATA or TEXT or DATA
--------------------------------------------
Cancel the effects of the previous directives.
- REAL or REALMODE
----------------
Assume that next segments ( from where one of these directives is, and
after SEGMENT, PAGE, DPAGE or PPAGE ) are 64 Kb long like in real-mode. These
directives are only useful to produce an error in case of overflow. They have
no consequence on code generation.
- UNREAL or UNLIMIT or FLAT
-------------------------
Unlike the previous directives, this set assumes that the next
segments ( until a new directive of this kind is encountered ) have no size
limit. By default, all segments have an infinite size for GEMA.
- SEGSIZE <size>
--------------
The following segments will be <size> bytes long.
These three sets of directives may be prefixed with an alignment
directive ( SEGMENT, PAGE, DPAGE or PPAGE ) or can themselves be prefixes to
any instruction.
For instance :
        SEGMENT:REAL
is exactly like :
        SEGMENT
        REAL
        SEGSIZE 4096:PAGE
means :
        SEGSIZE 4096
        PAGE
In both cases, the order has no influence. So DPAGE:UNREAL and
UNREAL:DPAGE have the same meaning.
These directives are usually useful in real mode or in protected mode
with funny segment sizes. In all other cases, there is no point in using them,
as GEMA assumes all segments are infinite by default.
III. Mnemonics
--------------
All ( but two ) TASM and MASM mnemonics are available with the same
designation under GEMA, even all synonyms ( such as JZ and JE ) . Some
mnemonics have different names with TASM and MASM ( for instance XLAT and
XLATB ) : both designations are supported by GEMA and have of course the same
effect.
But GEMA also features alternative synonyms. Most of them are 680x0
equivalents or more logical forms.
The following list represents some equivalent mnemonics and those that
need some extra comments :
LEAVE = UNLINK
MOV = MOVE
MOVSX = MOVESX
MOVZX = MOVEZX
TRAPV = INTO
WBINVD = FLUSH
TRAP = INT
TRAP also accepts its argument written the absolute way ( without
the '#' ) . For instance, TRAP #14 is exactly identical to TRAP 14 or INT 14.
RTED = RTID = IRETD
RTE = RTI = IRET
BRAF ~ JMPF
JMPF is the FAR alternative of JMP. It expects two arguments,
the segment and the offset, for instance JMPF $14c9,$418db2a.
But as we often use a FAR JMP with a label address or an absolute
address, writing "JMPF \label,:label" all the time would be really lousy.
Instead, just use BRAF, which works like JMPF except that
it expects only one argument, a 32-bit address that it automatically converts
to a segment and an offset.
For instance, BRAF $12345 is similar to JMPF $1234,5
386+ allow use of these instructions with a 32-bit offset.
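The conversion done by BRAF can be reproduced in a couple of lines of Python. This sketch is inferred from the single example above ( $12345 gives $1234,5 ) and assumes 16-byte real-mode paragraphs; the function name is hypothetical.

```python
def split_far_address(linear):
    """Split a linear address into ( segment, offset ) as BRAF seems to do."""
    segment = linear >> 4     # same as the '@' operator : divide by 16
    offset = linear & 0xF     # the 4 lower bits
    return segment, offset


segment, offset = split_far_address(0x12345)
print(f"JMPF ${segment:X},{offset}")
```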
BRA ~ JMP
BRA is similar to JMP, except that, as expected, JMP label
under GEMA means JMP [label] to TASM, ie. a jump to the address contained in
"label", and not a direct jump to "label". That's why the logical way of doing
a direct jump is JMP #label. But as "JMP #label" is more often useful than
"JMP label", you'd rather use BRA, which is similar except in this case : "BRA
label" means "BRA #label" or "JMP #label". In other cases, BRA can deal with
all the addressing modes a JMP would support, such as "BRA (si, dx)". A BRA or
JMP can be short, word or long ( for flat-real and protected modes ) .
REP = REPE = REPZ
There are two ways of using these prefixes :
- Either as independent instructions on a single line, keeping
all the features of any instruction,
- Or as prefixes. In this case, as any good prefix, they have
to be placed before the instruction they act on. A colon must separate both
elements.
For instance :
toto    ds
        gs: move.l (si),a
        rep : outs.b
        repne
        ins.l
As you can see, you can add extra spaces before and after the colon
without any problem, as always.
HLT = STOP
XOR = EOR
CMC = NGC
CLD = D+ ( '+' means : increment )
STD = D- ( '-' means : decrement )
CLI = INTOFF
STI = INTON ( Saturn coders will love these ! )
ADDX = ADC
BS+ = BSF ( idem, IMHO, a '+' is clearer than (F)orward )
BS- != BSR
WARNING : BSR DOES NOT HAVE THE SAME MEANING UNDER GEMA AND UNDER TASM&MASM.
Indeed, it is used to call subroutines, as we'll explain later. So
to scan bits in reverse, you MUST use BS-.
BTSTC = BTSTN = BTC
BTSTR = BTR
TAS = BTS = BTSTS
BTST = BT
(BSRF = JSRF) ~ CALLF
These are the FAR versions of CALL. They work the same way as JMPF and
BRAF : the instructions that expect only one argument are BSRF and JSRF, and
the two-argument form of the CALL instruction is logically named CALLF.
It is absolutely RIDICULOUS to declare subroutines with PROC NEAR or
FAR foo in machine language. Assembly was designed for heavy coders wanting to
make their compi blow out their silicon minds, and not for structured-language
fans, who had better try PASCAL or some other shit like that.
So under GEMA, you just have to do a BSRF or BSR, and you know
exactly how it will be assembled.
(BSR = JSR) ~ CALL
NEAR version of CALL. See BRA and JMP.
RTS = RTN
RTSF = RTNF
NEAR and FAR ways of coming back from a subroutine. May be followed by
an immediate to enforce the stack depth.
EXTA.Q = EXT.Q = CDQ
EXTA = EXT or EXT.W = CBW
CWD = EXT.L = EXTA.L
DIVS = IDIV
DIVU = DIV
LINK = ENTER
WAIT = FWAIT ( Microsoft / Borland )
MULS = IMUL
MULU = MUL
INS, OUTS, MOVES, LODS and co. :
The string instructions have to follow GEMA's logic, that is an
instruction optionally followed by its size, .W being the default.
So instead of OUTSD with TASM, just write : OUTS.L
BHI = JNBE
BCC = JAE = JNB = JNC
BCS = JNAE
BLS = JBE = JNA
BGE = JGE = JNL
BVC = JNO
BLT = JNGE
BLE = JLE = JNG
BCXZ = JCXZ
BEQ = JE = JZ
BGT = JNLE
BECXZ = JECXZ = JCEZ = BCEZ
BPL = JNS
BNE = JNE = JNZ
BPO = JPO = JNP
BVS = JO
BPE = JPE = JP
BMI = JS
I may have forgotten some, but the following are all equivalent :
- All synonyms recognized by MASM and TASM
- Their Motorola 680x0 equivalents
SETxx = Sxx
Same here. For instance, SZ can be used as well as SEQ ... All Microsoft,
Borland and Motorola ways are recognized.
LOOPE = LOOPZ = DBEQ
LOOPNE = LOOPNZ = DBNE
LOOP = DBF = DBRA
ROXL = RCL
ROXR = RCR
SAHF = SAF
ASL = SAL = SHL
ASR = SAR ( != SHR )
SUBX = SBB
XLAT = XLATB
Of course, whatever synonym you choose, the addressing modes keep to
GEMA's norm.
- When an immediate value is always expected, removing the '#'
is tolerated,
- When the register size depends on the instruction size
( 95% of the usual cases ), you can use one-letter designations for registers,
- When source and target arguments are involved, they always have to
be in this order,
- Instructions featuring a 32-bit alternative should activate it
with ".L", or it is automatically done with 32-bit offsets or immediates.
Anyway, when there is no possible ambiguity, GEMA tolerates some
abuses ( like INT 14 that normally should be only INT #14 ), and in all cases,
the most logical way is assumed. And in case of trouble, lotsa pretty precise
errors and warnings will help you.
AAM and AAD
These instructions may be followed by an immediate number ( with or
without a # ), which allows any divisor ( and not only 10 ) . This works on all
CPUs, but was never officially documented.
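What the generalized forms compute can be modelled in Python. This is a sketch of the well-known x86 behaviour of AAM and AAD with an arbitrary base; the function names and the tuple layout are illustrative only.

```python
def aam(al, base=10):
    """AAM base : split AL into AH = AL / base and AL = AL % base."""
    if base == 0:
        raise ZeroDivisionError("AAM 0 raises a divide error on a real CPU")
    return al // base, al % base            # ( AH, AL )


def aad(ah, al, base=10):
    """AAD base : AL = ( AL + AH * base ) truncated to 8 bits, AH = 0."""
    return 0, (al + ah * base) & 0xFF       # ( AH, AL )


print(aam(0x2A, 16))    # split a byte into its two hexadecimal digits
print(aad(2, 10, 16))   # and put them back together
```

With a base of 16, AAM becomes a cheap way of separating the two nibbles of a byte.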
SALC
ICEBP = ICE01 = TRAP01
UMOV = UMOVE
LOADALL
Undocumented opcodes of 386+. See www.x86.org for more information.
All these opcodes are implemented in GEMA, but a warning appears when
the verbose ( -v or --verbose ) option is enabled.
CMOV = CMOVE
RDPMC
UD
UD2
New opcodes of the P6 CPU, fully implemented in GEMA. As UD and UD2
seem to work on other sorts of CPU too, they never generate any warning.
Here is a very complex and original program, displaying "Hello World"
on your screen :
        org     $100
        push    cs
        pop     ds              ds = cs
        move    #plouf,d        plouf's offset in dx
        move.b  #9,ah           9 in ah
        trap    #$21            system call #$21
        move    #$4c00,a        exit(0)
        trap    #$21            pooeek !
plouf   dc.b    "Hello world !",13,10,'$'  the marvellous text we want to display
Oh god ! Assembling this tiny piece of source code will produce a .COM
file displaying an odd message with the ugly default DOS font.
Here is another version of it, ridiculously longer. But this shows the
structure of an executable ( .EXE ) program. At least half of the following
instructions are dumbshit, but this can help you understand GEMA's point of
view on segmentation :
        header                  create a .EXE and not a .COM
        overlay $1234           overlay number ( not very useful here )
        min     1+@fin          minimal necessary memory ( implicit )
        max     1+@fin          ...equals the max one ( just for fun )
        move    cs,a            code segment address
        move    a,ds            ...into ds
        move.b  #9,ah           function 9
        move    #plouf,d        plouf's offset into DX
        trap    #$21            call the DOS
        move    #$4c00,a        return code to the DOS
        trap    #$21            let's call it
plouf   dc.b    "Hello world !",13,10,'$'  text to be displayed
        segment                 alignment to a multiple of 16 bytes
        ds.l    128             some space for the stack ( 128 Lwords )
fin     stack                   the new stack starts here
If despite this summary you still have trouble using GEMA, please
report it to me, the fastest way being thru e-mail to j@nether.net
IV. That's all folks
--------------------
I hope you'll be able to use this fabulous ( well... just about ) tool
and see its advantages over TASM and MASM. All suggestions, criticisms, remarks
and bug reports will be welcome.